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Summary. — Principal component analysis is a statistical method, which lowers 
the number of important variables in a data set. The use of this method for the 
bursts' spectra and afterglows is discussed in this paper. The analysis indicates 
that three principal components are enough among the eight ones to describe the 
variablity of the data. The correlation between spectral index a and the redshift 
suggests that the thermal emission component becomes more dominant at larger 
redshifts. 
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1. Introduction 

The principal component analysis (PCA) is a powerful statistical method, if the ob- 
servational data contain several variables [1] . If there are some correlations among the 
variables, then PCA lowers the number of variables needed for the description of the 
dataset without any essential loss of generality. It introduces the quantities called prin- 
cipal components (PCs). Then one can omit some PCs not leading to essential loss of 
information about the dataset (for more details about PCA see PQ). 

In the topic of gamma-ray bursts (GRBs) the paper [2] used PCA on the dataset of 
B ATSE Catalog 3J . This Catalog contains nine measured variables in the gamma-band 
alone (three different peak fluxes, the four fluences in the four different energy channels, 
and the two durations) for any GRB. [2] has shown that from these nine variables only 
three ones are also enough (one of the durations, a combination of the fluences and the 
peak fluxes, and the fluence on the highest energy channel alone). 
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As far as it is known, since 1998 no further PCA was applied on any GRBs' datasets. 
This is remarkable, because for the dozens of GRBs further observational quantities 
are also known from the analyses of spectra and from the afterglow (AG) data. Thus, 
the usefulness of such generalized PCA is a triviality. This contribution presents the 
preliminary results of a new PCA, in which more variables than in [2] are included. 

In Jochen Greiner's list [4], up to September 30, 2004, there are 38 GRBs with known 
redshift z: for GRB980329 and GRB011030X there are upper limits only, and in one 
case z is queried (GRB980326). These 41 GRBs define the primary sample here. 

For every GRB in our sample we will consider 8 variables: z, a, /?, AGbreak, E pea k, 
AGa, Tgo and fluence. The last two variables are the usual Tgo durations and fluences 
used already in BATSE Catalog. The indices a and /3 are the slopes in Band's spectrum, 
Epeak defines the photon energy, where the Band spectrum EF(E) peaks (F(E)dE is 
the fluence from infinitesimal photon energy interval E, (E + dE)). These three variables 
define the spectrum of GRBs. AGfc rea fc is the observed time (in days), when the power- 
law decay of AG changes into a faster decay; the slope of the decay before this AGt rea fc is 
denoted as AGa. (This slope is usually denoted also as a, but using this notation there 
would be a confusion with the spectrum's slope a.) 

All this means that we have taken only two variables from BATSE Catalog (the 
Tgo duration and fluence), which in essence defined the two PCs in [2]. The third PC of 
BATSE Catalog, namely the fluence at the highest energy channel, will not be considered 
here, because for this channel the data are often vanishing and/or noisy. Our main aim 
here is to obtain informations and conclusions about the six remaining new quantities, 
not about the BATSE values themselves; hence, it is better not to consider here the 
fluence at the highest energy channel. 

There are 16 GRBs from the mentioned 41 GRBs that have all these variables in 
the Greiner's list. These 16 GRBs define the net sample for our PCA. For fluence, Tgo, 
AGfc rea/ t (days), 1 + z and E pea k we will take the logarithms of the measured values. 

2. Principal Component Analysis 

PCA needs, first, the calculation of correlations among the variables. The following 
pairs (with the corresponding probabilities p) give strong correlations: log T 90 - log(l + z) 
[p = 66.49%), log(l +z)-a(p = 99.75%), log Fluence - \ogT go (p = 95.49%), \ogE peak 

- log T 90 {p = 97.98%) and logE peak - log Fluence (p = 99.72%). All other correlations 
are weaker. 

The second step in PCA is to calculate the eigenvalues and eigenvectors. PCA gives 
us the eigenvalues and the eigenvectors of the "best" projection (Table 1).. 

Then the third step in PCA is to define the important PCs. In accordance with p] 
the important PCs are that ones, which have eigenvalues above 1.0. From Table 1 We 
obtain the result that only 3 PCs are important. This is a highly remarkable conclusion: 
we have 8 variables, and their reduction to 3 is an essential lowering. In addition, two 
PCs were expected already from the earlier paper [2J, because these two PCs were given 
by the BATSE data alone. In other words, the 6 new quantities - not used already in [2 

- define only one further new important PC. 

As it is seen from Table 1, the first three eigenvectors (Compl-3) are the linear 
combination of logE pea k, logT 90 , log Fluence, log(l + z), a, logAGh rea fc and AGa. No 
variable alone is dominating in these PCs. All this means that, beyond the two PCs 
coming from the BATSE Catalog, there is only one further important PC, and this 
one is a complicated linear combination of all variables. In other words: Although the 
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Table I. - PCA result. 





Compl 


Comp2 


Comp3 


Comp4 


Comp5 


Comp6 


Comp7 


log(l + z) 


0.345 


-0.359 


-0.399 


0.301 


0.278 


0.348 


-0.538 


a 


0.355 


-0.394 


-0.283 


0.336 


-0.392 


-0.188 


0.497 


P 


-0.342 


0.016 


0.394 


0.798 


0.226 


0.128 


0.098 


log AGbreak 


-0.225 


0.473 


-0.560 


0.023 


0.132 


0.135 


-0.013 


AGa 


0.381 


-0.142 


0.484 


-0.285 


0.217 


0.377 


0.051 


log E peak 


-0.427 


-0.285 


-0.145 


-0.173 


-0.218 


0.716 


0.313 


logT 90 


-0.316 


-0.498 


-0.154 


-0.212 


0.640 


-0.349 


0.213 


log Fluence 


-0.400 


-0.378 


0.120 


-0.055 


-0.442 


-0.181 


-0.555 


Eigenvalue 


4.314 


1.761 


1.224 


0.464 


0.189 


0.038 


0.009 



correlations among the variables are generally weak, these correlations are enough to 
ensure a coupling among the variables. 

The log-Epeafc+logT9o+ log Fluence correlation is well-known. This correlation in 
essence ensures that log E pea k alone is not independent. The correlation log(l + z) and 
a is also ensures that the redshift alone is also not independent - a quite remarkable 
behavior. (As it is seen on Figure 1, the positive correlation between log(l + z) and a 
is obvious, because the trend - increasing of z with increasing a - is well seen. Contrary 
this, no such trend is seen between z (3.) Similarly, the remaining correlations are enough 
to lower the number of important variables. 

Note still that, interestingly, (3 stands practically alone in fourth PC (Comp4). Hence, 
the only variable, which can define alone a PC, is the quantity (3. But, in accordance 
with [I], this PC has already an eigenvalue smaller than 1.0. 
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Fig. 1. - 1 + z distribution of the a and f3 values of the net data. 
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3. Discussion 

The importance of PCA is given also by the fact that it uses exclusively the observa- 
tional data, and any model of GRBs must respect the reduction of important variables. 
Here we briefly discuss the consequences of PCA on the models of GRBs. 

The prompt emission in GRBs can in most cases be described by various models of 
optically-thin synchrotron emission. Theoretical considerations suggest that, apart from 
this non-thermal emission, a thermal emission component, from the photosphere of the 
outflow, could be equally strong in the gamma-ray band [5J [5] . In [7] it was found that 
the observed spectra can indeed be described by a thermal black-body superimposed on 
a non-thermal spectrum. The relative strengths of these two components vary. A large 
a is measured if the thermal component is dominant, while a lower value (more like a 
synchrotron value) is found if the non-thermal component dominates. The variations in 
a therefore could suggest a variation in importance of the photosphere. 

Note that a values are measured on the integrated spectra. These are necessar- 
ily softer than the instantaneous spectra [8j. It is the instantaneous spectrum that 
reflects the physical radiation process shaping the spectrum. We therefore have that 
Oa-ad.proc. > &mt.- The strong correlation between a and l + z therefore suggests that the 
thermal component becomes more dominant at larger redshifts. In [5J it was showed that 
a combination of the variation time scale at the central engine and the dimension-less 
entropy of the outflow determines this relation. Low variability at the central engine 
combined with large Lorentz factors should therefore have been dominant at large red- 
shifts. [9] also found this a vs. l + z correlation. They explained it in terms of the 
Epeak oc E®^ d correlation, where E ra d is the total emitted energy of GRBs. 

4. Conclusions 

PCA method indicates that three PCs are enough to describe the variability of the 
data. Hence, a reduction of the eight variables into three is strongly suggested. The 
strong correlation between a and l + z suggests that the thermal emission component 
becomes more dominant at larger redshifts. 

* * * 
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